I started by reading in all of the text files. This region is the Midwest, and I wanted to choose at least one article from each state to get a good representation. The states included in this region are North Dakota, South Dakota, Kansas, Nebraska, Minnesota, Iowa, Missouri, Wisconsin, Illinois, Indiana, Michigan, and Ohio.

Next I made a frequency dataframe of the words in each article, excluding stop words.
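
As a rough illustration of this step (not the code actually used for this report, which is not shown), a word count that skips stop words might look like the Python sketch below. The stop-word list, the function name `word_frequencies`, and the sample sentence are all made up for the example.

```python
import re
from collections import Counter

# Hypothetical stop-word list for illustration only; a real analysis would
# use a much fuller list.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "are", "on"}

def word_frequencies(text, stop_words=STOP_WORDS):
    """Count lowercase word tokens in `text`, skipping stop words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(t for t in tokens if t not in stop_words)

# Made-up sentence standing in for an article.
sample = "The floods in the valley and the floods on the farmland are a concern."
freq = word_frequencies(sample)
print(freq.most_common(3))
```

Sorting the resulting counts from most to least frequent gives the kind of frequency list discussed in the next paragraph.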

It was interesting to see the wide range of issues that were at the top of these frequency lists. Some talked a lot about politics and government, others about farmland, forests, and migration, and some about floods and other extreme weather.

Then I used the afinn, nrc, and bing lexicons to get sentiment values for each word that appeared in the frequency tables.

## # A tibble: 2,477 × 2
##    word       value
##    <chr>      <dbl>
##  1 abandon       -2
##  2 abandoned     -2
##  3 abandons      -2
##  4 abducted      -2
##  5 abduction     -2
##  6 abductions    -2
##  7 abhor         -3
##  8 abhorred      -3
##  9 abhorrent     -3
## 10 abhors        -3
## # … with 2,467 more rows
## # A tibble: 13,875 × 2
##    word        sentiment
##    <chr>       <chr>    
##  1 abacus      trust    
##  2 abandon     fear     
##  3 abandon     negative 
##  4 abandon     sadness  
##  5 abandoned   anger    
##  6 abandoned   fear     
##  7 abandoned   negative 
##  8 abandoned   sadness  
##  9 abandonment anger    
## 10 abandonment fear     
## # … with 13,865 more rows
## # A tibble: 6,786 × 2
##    word        sentiment
##    <chr>       <chr>    
##  1 2-faces     negative 
##  2 abnormal    negative 
##  3 abolish     negative 
##  4 abominable  negative 
##  5 abominably  negative 
##  6 abominate   negative 
##  7 abomination negative 
##  8 abort       negative 
##  9 aborted     negative 
## 10 aborts      negative 
## # … with 6,776 more rows

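Then I made tables of these sentiment values for each article. The bing tables are printed first, followed by the nrc tables and then the afinn tables.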

## 
## negative positive 
##       21        9
## 
## negative positive 
##        7       13
## 
## negative positive 
##       15        5
## 
## negative positive 
##        9       16
## 
## negative positive 
##       19       10
## 
## negative positive 
##       18       14
## 
## negative positive 
##       22       18
## 
## negative positive 
##       23       26
## 
## negative positive 
##       35        8
## 
## negative positive 
##       16        8
## 
## negative positive 
##       18       10
## 
## negative positive 
##       14        6
## 
## negative positive 
##        2        2
## 
## negative positive 
##        4        9
## 
## negative positive 
##       23       15
## 
##        anger anticipation      disgust         fear          joy     negative 
##           11           16            6           18           11           20 
##     positive      sadness     surprise        trust 
##           30            9            7           22
## 
##        anger anticipation      disgust         fear          joy     negative 
##            5            7            2            6            7           13 
##     positive      sadness     surprise        trust 
##           36            3            2           20
## 
##        anger anticipation      disgust         fear          joy     negative 
##            5           10            2           10            4           17 
##     positive      sadness     surprise        trust 
##           20            5            4           13
## 
##        anger anticipation      disgust         fear          joy     negative 
##            7           10            1            3           11            7 
##     positive      sadness     surprise        trust 
##           28            1            5           22
## 
##        anger anticipation      disgust         fear          joy     negative 
##           10           11            1           16            7           19 
##     positive      sadness     surprise        trust 
##           33           12            5           13
## 
##        anger anticipation      disgust         fear          joy     negative 
##            9           18            6           11           10           19 
##     positive      sadness     surprise        trust 
##           35            9            8           14
## 
##        anger anticipation      disgust         fear          joy     negative 
##           11           22            5           11           13           20 
##     positive      sadness     surprise        trust 
##           57            7            4           29
## 
##        anger anticipation      disgust         fear          joy     negative 
##            7           18            6            9           13           19 
##     positive      sadness     surprise        trust 
##           50            7            4           34
## 
##        anger anticipation      disgust         fear          joy     negative 
##           16           16           17           23            6           38 
##     positive      sadness     surprise        trust 
##           42           18            7           30
## 
##        anger anticipation      disgust         fear          joy     negative 
##            7           11            3           10            4           20 
##     positive      sadness     surprise        trust 
##           22            7            4           17
## 
##        anger anticipation      disgust         fear          joy     negative 
##            8           12            1           14            7           19 
##     positive      sadness     surprise        trust 
##           34           12            5           14
## 
##        anger anticipation      disgust         fear          joy     negative 
##            5           13            3            9            8           14 
##     positive      sadness     surprise        trust 
##           23            4            3           17
## 
##        anger anticipation      disgust         fear          joy     negative 
##            3            8            1            7            3           11 
##     positive      sadness     surprise        trust 
##           23            7            3           14
## 
##        anger anticipation         fear          joy     negative     positive 
##            3           10            3            6            8           28 
##      sadness     surprise        trust 
##            2            1           19
## 
##        anger anticipation      disgust         fear          joy     negative 
##           11           16            4           16           11           25 
##     positive      sadness     surprise        trust 
##           37           12            4           27
## 
## -3 -2 -1  1  2  3 
##  4  6  5  6  4  1
## 
## -2 -1  1  2  3 
##  4  3  9  7  1
## 
## -3 -2 -1  1  2 
##  3  7  4  7  1
## 
## -2  1  2  3 
##  3 14 12  2
## 
## -3 -2 -1  1  2 
##  2 10  6  4  3
## 
## -3 -2 -1  1  2  3  4 
##  5 12  1  7  2  1  1
## 
## -3 -2 -1  1  2  3 
##  2  7  7 14  5  1
## 
## -3 -2 -1  1  2  3  4 
##  2  7  3 11  9  1  1
## 
## -3 -2 -1  1  2  3 
##  5 13  9  9  7  1
## 
## -3 -2 -1  1  2  3 
##  1  9  4 11  2  1
## 
## -3 -2 -1  1  2 
##  2 10  6  5  3
## 
## -3 -2 -1  1  2 
##  2  3  5  5  4
## 
## -3 -2 -1  1  2  3 
##  1  2  4  4  2  1
## 
## 1 2 
## 3 9
## 
## -3 -2 -1  1  2 
##  3 14  4  8  5

In the bing tables, each word is rated either positive or negative. Most of the articles I chose had slightly more negative values than positive ones. The only states with articles that had more positive word ratings were North and South Dakota, Indiana, and Michigan. The only state with a heavy skew was Minnesota, with 35 negative and only 8 positive.
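
To make the lookup-and-tabulate step concrete, here is a minimal Python sketch. It is not the code behind the tables above. The four-word lexicon is a stand-in: "abnormal" and "abolish" are taken from the bing preview printed earlier, while the two positive entries and the word list are assumed for illustration. The helper names `sentiment_table` and `negative_share` are hypothetical.

```python
from collections import Counter

# Tiny stand-in lexicon. "abnormal" and "abolish" appear as negative in the
# bing preview printed earlier; the positive entries are assumed.
BING_SAMPLE = {
    "abnormal": "negative",
    "abolish": "negative",
    "good": "positive",
    "improve": "positive",
}

def sentiment_table(words, lexicon=BING_SAMPLE):
    """Tabulate sentiment labels for the words that appear in the lexicon."""
    return Counter(lexicon[w] for w in words if w in lexicon)

def negative_share(table):
    """Fraction of matched words that are labelled negative."""
    total = table["negative"] + table["positive"]
    return table["negative"] / total if total else 0.0

# Distinct words from a made-up frequency table; "flood" is not in the
# stand-in lexicon, so it is dropped from the tabulation.
words = ["abnormal", "good", "flood", "abolish", "improve"]
table = sentiment_table(words)
print(table)

# The counts reported above for the Minnesota article.
minnesota = Counter(negative=35, positive=8)
print(round(negative_share(minnesota), 2))
```

Expressing each table as a share of negative words puts articles with different numbers of matched words on the same scale, which makes the Minnesota skew easier to compare with the others.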

Then I made word clouds to look at frequency again.

I don’t think I came to any more conclusions from the word clouds than from just looking at the frequency tables. The differing text sizes made it easier to see the difference in frequencies within one article, but the word clouds didn’t help much for comparing between articles.

Finally I made the tf_idf dataframe.

Some of the words with the highest tf_idf values were “forest”, “agriculture”, “education”, and “organic”.
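
For readers unfamiliar with tf-idf, the Python sketch below shows one common definition: term frequency (a word's count divided by the total words in the article) multiplied by the natural log of the number of articles divided by the number of articles containing the word. I am assuming this matches what was computed for the tf_idf dataframe. The article names and token lists are made up.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return {doc_name: {term: tf-idf}} for a mapping of doc_name -> tokens.

    tf  = term count / total tokens in the document
    idf = ln(number of documents / number of documents containing the term)
    """
    n_docs = len(docs)
    counts = {name: Counter(tokens) for name, tokens in docs.items()}

    # Number of documents in which each term appears at least once.
    doc_freq = Counter()
    for c in counts.values():
        doc_freq.update(c.keys())

    scores = {}
    for name, c in counts.items():
        total = sum(c.values())
        scores[name] = {
            term: (n / total) * math.log(n_docs / doc_freq[term])
            for term, n in c.items()
        }
    return scores

# Made-up token lists standing in for three articles.
docs = {
    "article_a": ["forest", "climate", "forest", "policy"],
    "article_b": ["climate", "education", "policy"],
    "article_c": ["climate", "organic", "organic"],
}
scores = tf_idf(docs)
for name, terms in scores.items():
    top = max(terms, key=terms.get)
    print(name, top, round(terms[top], 3))
```

A word that appears in every article, like "climate" in this toy data, gets a score of zero, which is why the top tf_idf words tend to be the ones that set an article apart.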

Next are the articles for the Southwest region.

I started by reading in all of the articles I chose. This region only has 4 states in it, so I chose 4 articles from each state, and for each state I took all of the articles from the same newspaper. The states included in this region are Texas, Oklahoma, Arizona, and New Mexico.

Next I made frequency tables of each article, excluding the stop words.

Most of these articles had high frequency words about the economy, and what seemed to be the economic impacts of climate change. Only a few had words about politics and government, emissions and chemicals, or weather.

Then I used afinn, nrc, and bing to get sentiment values for each word in the frequency tables.

## 
## negative positive 
##       56       40
## 
## negative positive 
##       78       56
## 
## negative positive 
##       39       18
## 
## negative positive 
##       21       19
## 
##        anger anticipation      disgust         fear          joy     negative 
##           25           46           14           31           31           63 
##     positive      sadness     surprise        trust 
##          107           16           22           70
## 
##        anger anticipation      disgust         fear          joy     negative 
##           22           49           20           31           32           69 
##     positive      sadness     surprise        trust 
##          101           32           23           64
## 
##        anger anticipation      disgust         fear          joy     negative 
##           18           25            9           23           12           42 
##     positive      sadness     surprise        trust 
##           51           16           13           35
## 
##        anger anticipation      disgust         fear          joy     negative 
##            7            9            6           13            7           25 
##     positive      sadness     surprise        trust 
##           29            9            3           12
## 
## -3 -2 -1  1  2  3  4 
##  7 19 15 19 16  2  1
## 
## -3 -2 -1  1  2  3  4 
## 10 28 16 25 23  8  3
## 
## -3 -2 -1  1  2  4 
##  8 15 14  9  7  1
## 
## -3 -2 -1  1  2  3  4 
##  2  7  5  7  5  1  1
## 
## negative positive 
##       23       22
## 
## negative positive 
##       13        9
## 
## negative positive 
##       14        7
## 
## negative positive 
##       20       14
## 
##        anger anticipation      disgust         fear          joy     negative 
##            8           19            2           11           12           24 
##     positive      sadness     surprise        trust 
##           55            7            7           36
## 
##        anger anticipation      disgust         fear          joy     negative 
##            7           14            2            5           12           17 
##     positive      sadness     surprise        trust 
##           46            2            1           34
## 
##        anger anticipation      disgust         fear          joy     negative 
##            5            9            4           15            8           19 
##     positive      sadness     surprise        trust 
##           24            5            5           23
## 
##        anger anticipation      disgust         fear          joy     negative 
##           13           20            9           15           12           27 
##     positive      sadness     surprise        trust 
##           34           10           11           31
## 
## -4 -3 -2 -1  1  2  3 
##  1  6  6  7 14  9  2
## 
## -3 -2 -1  1  2  3 
##  2  5  4 11  7  1
## 
## -3 -2 -1  1  2 
##  1  4  2  7  5
## 
## -3 -2 -1  1  2 
##  2  6  3  5  8
## 
## negative positive 
##       23       10
## 
## negative positive 
##       20       18
## 
## negative positive 
##        5        5
## 
## negative positive 
##       18       10
## 
##        anger anticipation      disgust         fear          joy     negative 
##            6           10            2            9            6           23 
##     positive      sadness     surprise        trust 
##           27            7            4           14
## 
##        anger anticipation      disgust         fear          joy     negative 
##           10           16            2           13           11           23 
##     positive      sadness     surprise        trust 
##           43            6            6           23
## 
##        anger anticipation      disgust         fear          joy     negative 
##            4            9            4            5            3            8 
##     positive      sadness     surprise        trust 
##           24            2            2           20
## 
##        anger anticipation      disgust         fear          joy     negative 
##            7            9            3            7            7           15 
##     positive      sadness     surprise        trust 
##           29            3            4           19
## 
## -3 -2 -1  1  2  3  4 
##  2  5  1  7  4  2  2
## 
## -3 -2 -1  1  2 
##  1  7  6  7  9
## 
## -3 -2 -1  1  2 
##  1  1  4  6  5
## 
## -3 -2 -1  1  2  3 
##  2  2  6 10  3  1
## 
## negative positive 
##       20       18
## 
## negative positive 
##       10        3
## 
## negative positive 
##       20       20
## 
## negative positive 
##       29       13
## 
##        anger anticipation      disgust         fear          joy     negative 
##           10           23            5           15           13           27 
##     positive      sadness     surprise        trust 
##           49           11            9           28
## 
##        anger anticipation      disgust         fear          joy     negative 
##            6            9            3            6            9           13 
##     positive      sadness     surprise        trust 
##           23            4            6           13
## 
##        anger anticipation      disgust         fear          joy     negative 
##           10           12            7           19           11           31 
##     positive      sadness     surprise        trust 
##           34           13            4           19
## 
##        anger anticipation      disgust         fear          joy     negative 
##           13           13            9           23            8           35 
##     positive      sadness     surprise        trust 
##           31           21            8           18
## 
## -3 -2 -1  1  2  3 
##  2  7  6  8  7  2
## 
## -3 -2 -1  1  2 
##  1  6  5  5  4
## 
## -3 -2 -1  1  2  3 
##  2 10  4  7 13  1
## 
## -4 -3 -2 -1  1  2 
##  1 10 12  6  8  9

For these tables, I wanted to compare the trends within each state. In the afinn tables, I found Texas to be heavily skewed toward the negative side, with many more values in the negative numbers than in the positive. Oklahoma was the only state with an article that had more positive numbers than negative, and the Arizona and New Mexico articles more or less followed a normal distribution.

The bing tables showed about the same thing, except that no article had more positive values than negative, so it is surprising that the Oklahoma article mentioned above had higher afinn scores.
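
One reason the two lexicons can disagree is that they cover different vocabularies (the previews above show 2,477 afinn entries against 6,786 bing entries), and afinn weights words by strength while bing only labels them. The Python sketch below compares one afinn table and one bing table copied from the output above. Treating them as the same article is an assumption based only on their matching position in the output, and `summarize_afinn` is a hypothetical helper name.

```python
# afinn table copied from the output above: score -> number of words.
afinn_counts = {-4: 1, -3: 6, -2: 6, -1: 7, 1: 14, 2: 9, 3: 2}

# bing table from the matching position in the output; assumed (not
# confirmed) to describe the same article.
bing_counts = {"negative": 23, "positive": 22}

def summarize_afinn(counts):
    """Return (negative word count, positive word count, summed score)."""
    neg = sum(n for score, n in counts.items() if score < 0)
    pos = sum(n for score, n in counts.items() if score > 0)
    total = sum(score * n for score, n in counts.items())
    return neg, pos, total

neg, pos, total = summarize_afinn(afinn_counts)
print(f"afinn: {neg} negative words, {pos} positive words, summed score {total}")
print(f"bing:  {bing_counts['negative']} negative words, "
      f"{bing_counts['positive']} positive words")
```

For this pair, afinn matches 25 positive words against 20 negative ones while bing leans slightly negative (23 to 22). The afinn summed score still comes out at -3, because the negative words it matched carry larger magnitudes.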

Then I made histograms for the afinn values from each article. I think I was able to understand this data just from the table, and these plots were unnecessary for me.
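
The histograms themselves are not reproduced here. As a stand-in, the Python sketch below draws a plain-text version from the first afinn table in this section's output; the function name `text_histogram` is hypothetical and this is not the plotting code that was used.

```python
# First afinn table from this section's output: score -> number of words.
afinn_counts = {-3: 7, -2: 19, -1: 15, 1: 19, 2: 16, 3: 2, 4: 1}

def text_histogram(counts, symbol="#"):
    """Return one line per score, with a bar as long as the word count."""
    lines = []
    for score in sorted(counts):
        n = counts[score]
        lines.append(f"{score:>3} | {symbol * n} {n}")
    return lines

for line in text_histogram(afinn_counts):
    print(line)
```

Each bar is just the table row drawn out, so the plot carries the same information as the table.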

Then I made word clouds for the frequency tables.

As with the histograms, I’m not sure how much these helped me, since it wasn’t easy to compare one word cloud with another. But I could see which words in each article appeared many more times than others, since they were printed in a bigger size. It was easier to understand the scale of each word with the word cloud than with the frequency table.

Finally I made the tf_idf dataframe.

Some of the words with the highest tf_idf values were “water”, “plant”, “energy”, and “oil”. I think this is an interesting difference from words like “education” and “agriculture” in the Midwest region I looked at. It seems like the Southwest region is more concerned with the root of the problem and possible solutions, whereas the Midwest region is concerned with the effects climate change could have.